Overview

Dataset statistics

Number of variables: 26
Number of observations: 2098
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 MiB
Average record size in memory: 623.4 B
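
The two memory figures can be cross-checked against each other. The sketch below assumes that the total size is roughly the average record size times the number of observations, and that 1 MiB is 1024 * 1024 bytes; the function name is illustrative and not part of pandas-profiling.

```python
# Figures quoted in the dataset statistics.
N_OBSERVATIONS = 2098
AVG_RECORD_BYTES = 623.4  # already rounded to one decimal in the report


def total_size_mib(n_rows: int, avg_record_bytes: float) -> float:
    """Approximate total in-memory size in MiB (1 MiB = 1024 * 1024 bytes)."""
    return n_rows * avg_record_bytes / (1024 * 1024)


total_mib = total_size_mib(N_OBSERVATIONS, AVG_RECORD_BYTES)
print(f"{total_mib:.1f} MiB")
```

Rounded to one decimal, the result agrees with the reported total size.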

Variable types

NUM: 14
CAT: 11
BOOL: 1

Reproduction

Analysis started: 2020-06-03 21:17:31.744556
Analysis finished: 2020-06-03 21:18:05.482476
Duration: 33.74 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml
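
The reported duration follows from the two timestamps. A minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out, based on how they appear here.

```python
from datetime import datetime

# Timestamp layout as it appears in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-06-03 21:17:31.744556", FMT)
finished = datetime.strptime("2020-06-03 21:18:05.482476", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```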

Warnings

JobRole is highly correlated with Department (High correlation)
Department is highly correlated with JobRole (High correlation)
NumCompaniesWorked has 265 (12.6%) zeros (Zeros)
TrainingTimesLastYear has 94 (4.5%) zeros (Zeros)
YearsAtCompany has 84 (4.0%) zeros (Zeros)
YearsInCurrentRole has 436 (20.8%) zeros (Zeros)
YearsSinceLastPromotion has 881 (42.0%) zeros (Zeros)
YearsWithCurrManager has 495 (23.6%) zeros (Zeros)
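
The percentages in the zero warnings are the zero counts divided by the 2098 observations. The sketch below only reproduces that arithmetic; it does not model the rule pandas-profiling uses to decide when a warning is raised, and the dictionary name is illustrative.

```python
N_OBSERVATIONS = 2098

# Zero counts copied from the warnings above.
zero_counts = {
    "NumCompaniesWorked": 265,
    "TrainingTimesLastYear": 94,
    "YearsAtCompany": 84,
    "YearsInCurrentRole": 436,
    "YearsSinceLastPromotion": 881,
    "YearsWithCurrManager": 495,
}

# Share of zeros per variable, formatted like the report (one decimal).
zero_share = {
    name: f"{100 * count / N_OBSERVATIONS:.1f}%"
    for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}")
```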

Variables

df_index
Real number (ℝ≥0)

Distinct count: 1628
Unique (%): 77.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 683.7907530981887
Minimum: 0
Maximum: 1627
Zeros: 2
Zeros (%): 0.1%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 52
Q1: 262
Median: 578.5
Q3: 1102.75
95-th percentile: 1522.15
Maximum: 1627
Range: 1627
Interquartile range (IQR): 840.75

Descriptive statistics

Standard deviation: 483.6309562
Coefficient of variation (CV): 0.7072791699
Kurtosis: -1.171855258
Mean: 683.7907531
Median Absolute Deviation (MAD): 386
Skewness: 0.3763172906
Sum: 1434593
Variance: 233898.9018
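
These figures are consistent with one simple reading that the report itself does not state: the index runs from 0 to 1627 and the first 470 values each occur a second time (1628 + 470 = 2098). The sketch below rebuilds that hypothetical index and compares it with the reported statistics; it is a plausibility check under that assumption, not a statement about how the dataset was actually assembled.

```python
import statistics

# Hypothesis (an inference, not stated in the report): values 0..1627 occur
# once each, and values 0..469 occur a second time.
index_values = sorted(list(range(1628)) + list(range(470)))

n = len(index_values)
distinct = len(set(index_values))
total = sum(index_values)
mean = total / n

# "inclusive" interpolates linearly between order statistics.
q1, median, q3 = statistics.quantiles(index_values, n=4, method="inclusive")

# Sample variance (divides by n - 1).
variance = statistics.variance(index_values)

print(n, distinct, total, round(mean, 4), q1, median, q3, round(variance, 4))
```

Under this assumption the count, distinct count, sum, quartiles, and variance all line up with the values reported for df_index.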
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
0                    2      0.1%
147                  2      0.1%
171                  2      0.1%
169                  2      0.1%
167                  2      0.1%
165                  2      0.1%
163                  2      0.1%
161                  2      0.1%
159                  2      0.1%
157                  2      0.1%
155                  2      0.1%
153                  2      0.1%
151                  2      0.1%
149                  2      0.1%
145                  2      0.1%
117                  2      0.1%
143                  2      0.1%
141                  2      0.1%
139                  2      0.1%
137                  2      0.1%
135                  2      0.1%
133                  2      0.1%
131                  2      0.1%
129                  2      0.1%
127                  2      0.1%
Other values (1603)  2048   97.6%

Value  Count  Frequency (%)
0      2      0.1%
1      2      0.1%
2      2      0.1%
3      2      0.1%
4      2      0.1%
5      2      0.1%
6      2      0.1%
7      2      0.1%
8      2      0.1%
9      2      0.1%

Value  Count  Frequency (%)
1627   1      < 0.1%
1626   1      < 0.1%
1625   1      < 0.1%
1624   1      < 0.1%
1623   1      < 0.1%
1622   1      < 0.1%
1621   1      < 0.1%
1620   1      < 0.1%
1619   1      < 0.1%
1618   1      < 0.1%

Age
Real number (ℝ≥0)

Distinct count: 43
Unique (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.95138226882745
Minimum: 18
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 18
5-th percentile: 21.85
Q1: 29
Median: 35
Q3: 42
95-th percentile: 54
Maximum: 60
Range: 42
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.427105093
Coefficient of variation (CV): 0.262218154
Kurtosis: -0.4441986877
Mean: 35.95138227
Median Absolute Deviation (MAD): 6
Skewness: 0.4366286152
Sum: 75426
Variance: 88.87031044
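
Several of the Age statistics are derived from others in the same block. The sketch below recomputes them from the reported inputs, using the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum); the variable names are illustrative.

```python
# Inputs copied from the Age statistics.
n_observations = 2098
total = 75426
std_dev = 9.427105093
q1, q3 = 29, 42
minimum, maximum = 18, 60

mean = total / n_observations   # arithmetic mean
cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2         # variance from the standard deviation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.8f} cv={cv:.9f} variance={variance:.6f} "
      f"iqr={iqr} range={value_range}")
```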
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
31                   117    5.6%
29                   108    5.1%
35                   102    4.9%
34                   101    4.8%
32                   89     4.2%
30                   88     4.2%
33                   82     3.9%
36                   81     3.9%
26                   79     3.8%
28                   76     3.6%
37                   70     3.3%
40                   69     3.3%
38                   62     3.0%
41                   60     2.9%
27                   56     2.7%
42                   54     2.6%
39                   54     2.6%
25                   50     2.4%
24                   50     2.4%
45                   49     2.3%
46                   49     2.3%
50                   46     2.2%
44                   45     2.1%
43                   36     1.7%
47                   36     1.7%
Other values (18)    389    18.5%

Value  Count  Frequency (%)
18     16     0.8%
19     25     1.2%
20     31     1.5%
21     33     1.6%
22     20     1.0%
23     22     1.0%
24     50     2.4%
25     50     2.4%
26     79     3.8%
27     56     2.7%

Value  Count  Frequency (%)
60     5      0.2%
59     10     0.5%
58     22     1.0%
57     4      0.2%
56     18     0.9%
55     34     1.6%
54     18     0.9%
53     27     1.3%
52     22     1.0%
51     27     1.3%
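
The "Other values (18)" row in the first Age table is the remainder once the 25 listed ages are subtracted from the 2098 observations. A minimal sketch of that bookkeeping, with the counts copied from the table and illustrative variable names:

```python
N_OBSERVATIONS = 2098

# Counts of the 25 individually listed ages, in table order.
listed_counts = [
    117, 108, 102, 101, 89, 88, 82, 81, 79, 76,
    70, 69, 62, 60, 56, 54, 54, 50, 50, 49,
    49, 46, 45, 36, 36,
]

# Everything not listed individually falls into the "Other values" row.
other_count = N_OBSERVATIONS - sum(listed_counts)
other_pct = 100 * other_count / N_OBSERVATIONS

print(other_count, f"{other_pct:.1f}%")
```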
 

BusinessTravel
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value              Count  Frequency (%)
Travel_Rarely      1439   68.6%
Travel_Frequently  481    22.9%
Non-Travel         178    8.5%

Length

Max length: 17
Median length: 13
Mean length: 13.66253575
Min length: 10
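
The length statistics follow directly from the three category labels and their counts. The sketch below weights each label's length by its frequency and also tallies individual characters; names such as `category_counts` are illustrative.

```python
import statistics
from collections import Counter

# Category labels and counts copied from the BusinessTravel table.
category_counts = {
    "Travel_Rarely": 1439,
    "Travel_Frequently": 481,
    "Non-Travel": 178,
}

n = sum(category_counts.values())

# Total characters across all observations, weighting each label by its count.
total_chars = sum(len(label) * count for label, count in category_counts.items())
mean_length = total_chars / n

# One length per observation, to take the median.
lengths = [len(label) for label, count in category_counts.items() for _ in range(count)]
median_length = statistics.median(lengths)

# Per-character totals, again weighted by how often each label occurs.
char_counts = Counter()
for label, count in category_counts.items():
    for ch in label:
        char_counts[ch] += count

print(total_chars, round(mean_length, 8), median_length, char_counts["e"])
```

The character total and the count for "e" match the figures the report gives for this variable (28664 characters, 4499 occurrences of "e").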

Overview of Unicode Properties

Unique unicode characters: 17
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e449915.7%
 
r401814.0%
 
l401814.0%
 
a353712.3%
 
T20987.3%
 
v20987.3%
 
_19206.7%
 
y19206.7%
 
R14395.0%
 
n6592.3%
 
F4811.7%
 
q4811.7%
 
u4811.7%
 
t4811.7%
 
N1780.6%
 
o1780.6%
 
-1780.6%
 

Most occurring categories

Value                  Count  Frequency (%)
Lowercase Letter       22370  78.0%
Uppercase Letter       4196   14.6%
Connector Punctuation  1920   6.7%
Dash Punctuation       178    0.6%
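
These category totals can be reproduced with Python's `unicodedata` module, which returns the two-letter Unicode general category for a character (`Ll` lowercase letter, `Lu` uppercase letter, `Pc` connector punctuation, `Pd` dash punctuation). This is a sketch of the idea, not necessarily how pandas-profiling computes the table.

```python
import unicodedata
from collections import Counter

# Category labels and counts copied from the BusinessTravel table.
category_counts = {
    "Travel_Rarely": 1439,
    "Travel_Frequently": 481,
    "Non-Travel": 178,
}

# Tally Unicode general categories, weighting each label by its frequency.
general_categories = Counter()
for label, count in category_counts.items():
    for ch in label:
        general_categories[unicodedata.category(ch)] += count

for code, total in general_categories.most_common():
    print(code, total)
```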
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T209850.0%
 
R143934.3%
 
F48111.5%
 
N1784.2%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e449920.1%
 
r401818.0%
 
l401818.0%
 
a353715.8%
 
v20989.4%
 
y19208.6%
 
n6592.9%
 
q4812.2%
 
u4812.2%
 
t4812.2%
 
o1780.8%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-178100.0%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_1920100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2656692.7%
 
Common20987.3%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e449916.9%
 
r401815.1%
 
l401815.1%
 
a353713.3%
 
T20987.9%
 
v20987.9%
 
y19207.2%
 
R14395.4%
 
n6592.5%
 
F4811.8%
 
q4811.8%
 
u4811.8%
 
t4811.8%
 
N1780.7%
 
o1780.7%
 

Most frequent Common characters

ValueCountFrequency (%) 
_192091.5%
 
-1788.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII28664100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e449915.7%
 
r401814.0%
 
l401814.0%
 
a353712.3%
 
T20987.3%
 
v20987.3%
 
_19206.7%
 
y19206.7%
 
R14395.0%
 
n6592.3%
 
F4811.7%
 
q4811.7%
 
u4811.7%
 
t4811.7%
 
N1780.6%
 
o1780.6%
 
-1780.6%
 

Department
Categorical

HIGH CORRELATION

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value                   Count  Frequency (%)
Research & Development  1293   61.6%
Sales                   706    33.7%
Human Resources         99     4.7%

Length

Max length: 22
Median length: 22
Mean length: 15.94899905
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
e736922.0%
 
26858.0%
 
s21976.6%
 
a20986.3%
 
l19996.0%
 
R13924.2%
 
r13924.2%
 
c13924.2%
 
o13924.2%
 
m13924.2%
 
n13924.2%
 
h12933.9%
 
&12933.9%
 
D12933.9%
 
v12933.9%
 
p12933.9%
 
t12933.9%
 
S7062.1%
 
u1980.6%
 
H990.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter2599377.7%
 
Uppercase Letter349010.4%
 
Space Separator26858.0%
 
Other Punctuation12933.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
R139239.9%
 
D129337.0%
 
S70620.2%
 
H992.8%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e736928.3%
 
s21978.5%
 
a20988.1%
 
l19997.7%
 
r13925.4%
 
c13925.4%
 
o13925.4%
 
m13925.4%
 
n13925.4%
 
h12935.0%
 
v12935.0%
 
p12935.0%
 
t12935.0%
 
u1980.8%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
2685100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
&1293100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2948388.1%
 
Common397811.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e736925.0%
 
s21977.5%
 
a20987.1%
 
l19996.8%
 
R13924.7%
 
r13924.7%
 
c13924.7%
 
o13924.7%
 
m13924.7%
 
n13924.7%
 
h12934.4%
 
D12934.4%
 
v12934.4%
 
p12934.4%
 
t12934.4%
 
S7062.4%
 
u1980.7%
 
H990.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
268567.5%
 
&129332.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII33461100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e736922.0%
 
26858.0%
 
s21976.6%
 
a20986.3%
 
l19996.0%
 
R13924.2%
 
r13924.2%
 
c13924.2%
 
o13924.2%
 
m13924.2%
 
n13924.2%
 
h12933.9%
 
&12933.9%
 
D12933.9%
 
v12933.9%
 
p12933.9%
 
t12933.9%
 
S7062.1%
 
u1980.6%
 
H990.3%
 

DistanceFromHome
Real number (ℝ≥0)

Distinct count: 29
Unique (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.63632030505243
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 8
Q3: 15
95-th percentile: 26
Maximum: 29
Range: 28
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.257466164
Coefficient of variation (CV): 0.8569107193
Kurtosis: -0.432162311
Mean: 9.636320305
Median Absolute Deviation (MAD): 6
Skewness: 0.8657990186
Sum: 20217
Variance: 68.18574745
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
1                    280    13.3%
2                    279    13.3%
9                    149    7.1%
10                   118    5.6%
3                    116    5.5%
8                    104    5.0%
7                    104    5.0%
5                    97     4.6%
4                    96     4.6%
6                    71     3.4%
24                   56     2.7%
29                   47     2.2%
16                   44     2.1%
25                   41     2.0%
11                   41     2.0%
18                   38     1.8%
15                   38     1.8%
20                   37     1.8%
12                   36     1.7%
23                   35     1.7%
22                   35     1.7%
26                   33     1.6%
14                   33     1.6%
17                   32     1.5%
13                   31     1.5%
Other values (4)     107    5.1%

Value  Count  Frequency (%)
1      280    13.3%
2      279    13.3%
3      116    5.5%
4      96     4.6%
5      97     4.6%
6      71     3.4%
7      104    5.0%
8      104    5.0%
9      149    7.1%
10     118    5.6%

Value  Count  Frequency (%)
29     47     2.2%
28     31     1.5%
27     16     0.8%
26     33     1.6%
25     41     2.0%
24     56     2.7%
23     35     1.7%
22     35     1.7%
21     30     1.4%
20     37     1.8%

Education
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8913250714966634
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 4
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.019416207
Coefficient of variation (CV): 0.3525775142
Kurtosis: -0.5755315691
Mean: 2.891325071
Median Absolute Deviation (MAD): 1
Skewness: -0.2950445844
Sum: 6066
Variance: 1.039209402
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
3      824    39.3%
4      558    26.6%
2      406    19.4%
1      250    11.9%
5      60     2.9%

Value  Count  Frequency (%)
1      250    11.9%
2      406    19.4%
3      824    39.3%
4      558    26.6%
5      60     2.9%

Value  Count  Frequency (%)
5      60     2.9%
4      558    26.6%
3      824    39.3%
2      406    19.4%
1      250    11.9%

EducationField
Categorical

Distinct count: 6
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value             Count  Frequency (%)
Life Sciences     826    39.4%
Medical           652    31.1%
Marketing         251    12.0%
Technical Degree  212    10.1%
Other             110    5.2%
Human Resources   47     2.2%

Length

Max length: 16
Median length: 13
Mean length: 10.58531935
Min length: 5

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
e443320.0%
 
c277512.5%
 
i276712.5%
 
n13366.0%
 
a11625.2%
 
10854.9%
 
s9204.1%
 
M9034.1%
 
l8643.9%
 
L8263.7%
 
f8263.7%
 
S8263.7%
 
d6522.9%
 
r6202.8%
 
g4632.1%
 
t3611.6%
 
h3221.4%
 
k2511.1%
 
T2121.0%
 
D2121.0%
 
O1100.5%
 
u940.4%
 
H470.2%
 
m470.2%
 
R470.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter1794080.8%
 
Uppercase Letter318314.3%
 
Space Separator10854.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M90328.4%
 
L82626.0%
 
S82626.0%
 
T2126.7%
 
D2126.7%
 
O1103.5%
 
H471.5%
 
R471.5%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e443324.7%
 
c277515.5%
 
i276715.4%
 
n13367.4%
 
a11626.5%
 
s9205.1%
 
l8644.8%
 
f8264.6%
 
d6523.6%
 
r6203.5%
 
g4632.6%
 
t3612.0%
 
h3221.8%
 
k2511.4%
 
u940.5%
 
m470.3%
 
o470.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1085100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2112395.1%
 
Common10854.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e443321.0%
 
c277513.1%
 
i276713.1%
 
n13366.3%
 
a11625.5%
 
s9204.4%
 
M9034.3%
 
l8644.1%
 
L8263.9%
 
f8263.9%
 
S8263.9%
 
d6523.1%
 
r6202.9%
 
g4632.2%
 
t3611.7%
 
h3221.5%
 
k2511.2%
 
T2121.0%
 
D2121.0%
 
O1100.5%
 
u940.4%
 
H470.2%
 
m470.2%
 
R470.2%
 
o470.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
1085100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII22208100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e443320.0%
 
c277512.5%
 
i276712.5%
 
n13366.0%
 
a11625.2%
 
10854.9%
 
s9204.1%
 
M9034.1%
 
l8643.9%
 
L8263.7%
 
f8263.7%
 
S8263.7%
 
d6522.9%
 
r6202.8%
 
g4632.1%
 
t3611.6%
 
h3221.4%
 
k2511.1%
 
T2121.0%
 
D2121.0%
 
O1100.5%
 
u940.4%
 
H470.2%
 
m470.2%
 
R470.2%
 
Distinct count: 4
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
3      629    30.0%
4      610    29.1%
1      464    22.1%
2      395    18.8%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
362930.0%
 
461029.1%
 
146422.1%
 
239518.8%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2098100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
362930.0%
 
461029.1%
 
146422.1%
 
239518.8%
 

Most occurring scripts

ValueCountFrequency (%) 
Common2098100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
362930.0%
 
461029.1%
 
146422.1%
 
239518.8%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2098100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
362930.0%
 
461029.1%
 
146422.1%
 
239518.8%
 

Gender
Categorical

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value   Count  Frequency (%)
Male    1274   60.7%
Female  824    39.3%

Length

Max length: 6
Median length: 4
Mean length: 4.78551001
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 6
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
e292229.1%
 
a209820.9%
 
l209820.9%
 
M127412.7%
 
F8248.2%
 
m8248.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter794279.1%
 
Uppercase Letter209820.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M127460.7%
 
F82439.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e292236.8%
 
a209826.4%
 
l209826.4%
 
m82410.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin10040100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e292229.1%
 
a209820.9%
 
l209820.9%
 
M127412.7%
 
F8248.2%
 
m8248.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII10040100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e292229.1%
 
a209820.9%
 
l209820.9%
 
M127412.7%
 
F8248.2%
 
m8248.2%
 

JobInvolvement
Categorical

Distinct count: 4
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
3      1200   57.2%
2      563    26.8%
4      180    8.6%
1      155    7.4%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
3120057.2%
 
256326.8%
 
41808.6%
 
11557.4%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2098100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
3120057.2%
 
256326.8%
 
41808.6%
 
11557.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Common2098100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
3120057.2%
 
256326.8%
 
41808.6%
 
11557.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2098100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
3120057.2%
 
256326.8%
 
41808.6%
 
11557.4%
 

JobRole
Categorical

HIGH CORRELATION
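
The report does not say which association measure raised this flag. One common measure for two categorical variables is Cramér's V, which is 0 when the variables are independent and 1 when one fully determines the other. The sketch below is an uncorrected textbook version on hypothetical toy counts, not the JobRole and Department data and not necessarily the computation pandas-profiling performs.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: observed versus expected under independence.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy tables, NOT taken from the report.
perfectly_associated = [[30, 0], [0, 20]]
independent = [[10, 20], [20, 40]]

print(cramers_v(perfectly_associated), cramers_v(independent))
```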

Distinct count: 9
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value                      Count  Frequency (%)
Sales Executive            474    22.6%
Research Scientist         424    20.2%
Laboratory Technician      403    19.2%
Sales Representative       191    9.1%
Manufacturing Director     169    8.1%
Healthcare Representative  151    7.2%
Manager                    118    5.6%
Human Resources            88     4.2%
Research Director          80     3.8%

Length

Max length: 25
Median length: 18
Mean length: 18.12392755
Min length: 7

Overview of Unicode Properties

Unique unicode characters: 29
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
e566114.9%
 
a36849.7%
 
t29787.8%
 
i28887.6%
 
c28657.5%
 
r26767.0%
 
n21165.6%
 
s21115.6%
 
19805.2%
 
o11433.0%
 
S10892.9%
 
h10582.8%
 
u9882.6%
 
R9342.5%
 
l8162.1%
 
v8162.1%
 
E4741.2%
 
x4741.2%
 
L4031.1%
 
b4031.1%
 
y4031.1%
 
T4031.1%
 
p3420.9%
 
M2870.8%
 
g2870.8%
 
Other values (4)7452.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter3196684.1%
 
Uppercase Letter407810.7%
 
Space Separator19805.2%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S108926.7%
 
R93422.9%
 
E47411.6%
 
L4039.9%
 
T4039.9%
 
M2877.0%
 
D2496.1%
 
H2395.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e566117.7%
 
a368411.5%
 
t29789.3%
 
i28889.0%
 
c28659.0%
 
r26768.4%
 
n21166.6%
 
s21116.6%
 
o11433.6%
 
h10583.3%
 
u9883.1%
 
l8162.6%
 
v8162.6%
 
x4741.5%
 
b4031.3%
 
y4031.3%
 
p3421.1%
 
g2870.9%
 
f1690.5%
 
m880.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1980100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin3604494.8%
 
Common19805.2%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e566115.7%
 
a368410.2%
 
t29788.3%
 
i28888.0%
 
c28657.9%
 
r26767.4%
 
n21165.9%
 
s21115.9%
 
o11433.2%
 
S10893.0%
 
h10582.9%
 
u9882.7%
 
R9342.6%
 
l8162.3%
 
v8162.3%
 
E4741.3%
 
x4741.3%
 
L4031.1%
 
b4031.1%
 
y4031.1%
 
T4031.1%
 
p3420.9%
 
M2870.8%
 
g2870.8%
 
D2490.7%
 
Other values (3)4961.4%
 

Most frequent Common characters

ValueCountFrequency (%) 
1980100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII38024100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e566114.9%
 
a36849.7%
 
t29787.8%
 
i28887.6%
 
c28657.5%
 
r26767.0%
 
n21165.6%
 
s21115.6%
 
19805.2%
 
o11433.0%
 
S10892.9%
 
h10582.8%
 
u9882.6%
 
R9342.5%
 
l8162.1%
 
v8162.1%
 
E4741.2%
 
x4741.2%
 
L4031.1%
 
b4031.1%
 
y4031.1%
 
T4031.1%
 
p3420.9%
 
M2870.8%
 
g2870.8%
 
Other values (4)7452.0%
 

JobSatisfaction
Categorical

Distinct count: 4
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
3      650    31.0%
4      587    28.0%
1      457    21.8%
2      404    19.3%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
365031.0%
 
458728.0%
 
145721.8%
 
240419.3%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2098100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
365031.0%
 
458728.0%
 
145721.8%
 
240419.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common2098100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
365031.0%
 
458728.0%
 
145721.8%
 
240419.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2098100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
365031.0%
 
458728.0%
 
145721.8%
 
240419.3%
 

MaritalStatus
Categorical

Distinct count: 3
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value     Count  Frequency (%)
Married   885    42.2%
Single    786    37.5%
Divorced  427    20.4%

Length

Max length: 8
Median length: 7
Mean length: 6.828884652
Min length: 6

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
r219715.3%
 
i209814.6%
 
e209814.6%
 
d13129.2%
 
M8856.2%
 
a8856.2%
 
S7865.5%
 
n7865.5%
 
g7865.5%
 
l7865.5%
 
D4273.0%
 
v4273.0%
 
o4273.0%
 
c4273.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter1222985.4%
 
Uppercase Letter209814.6%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
M88542.2%
 
S78637.5%
 
D42720.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
r219718.0%
 
i209817.2%
 
e209817.2%
 
d131210.7%
 
a8857.2%
 
n7866.4%
 
g7866.4%
 
l7866.4%
 
v4273.5%
 
o4273.5%
 
c4273.5%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin14327100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
r219715.3%
 
i209814.6%
 
e209814.6%
 
d13129.2%
 
M8856.2%
 
a8856.2%
 
S7865.5%
 
n7865.5%
 
g7865.5%
 
l7865.5%
 
D4273.0%
 
v4273.0%
 
o4273.0%
 
c4273.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII14327100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
r219715.3%
 
i209814.6%
 
e209814.6%
 
d13129.2%
 
M8856.2%
 
a8856.2%
 
S7865.5%
 
n7865.5%
 
g7865.5%
 
l7865.5%
 
D4273.0%
 
v4273.0%
 
o4273.0%
 
c4273.0%
 

MonthlyIncome
Real number (ℝ≥0)

Distinct count: 1349
Unique (%): 64.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5964.597235462345
Minimum: 1009
Maximum: 19999
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 1009
5-th percentile: 2058
Q1: 2686
Median: 4477.5
Q3: 7477.25
95-th percentile: 16966.2
Maximum: 19999
Range: 18990
Interquartile range (IQR): 4791.25

Descriptive statistics

Standard deviation: 4447.984026
Coefficient of variation (CV): 0.7457308265
Kurtosis: 1.635826299
Mean: 5964.597235
Median Absolute Deviation (MAD): 1977.5
Skewness: 1.515274735
Sum: 12513725
Variance: 19784561.9
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
2404                 11     0.5%
5346                 10     0.5%
2342                 8      0.4%
2741                 7      0.3%
2380                 7      0.3%
2610                 7      0.3%
2956                 6      0.3%
2743                 6      0.3%
3407                 6      0.3%
2973                 6      0.3%
5993                 6      0.3%
4284                 6      0.3%
10609                6      0.3%
2515                 6      0.3%
4941                 6      0.3%
2044                 6      0.3%
2132                 6      0.3%
5238                 6      0.3%
2377                 6      0.3%
2340                 6      0.3%
2886                 6      0.3%
2293                 6      0.3%
2362                 6      0.3%
2657                 6      0.3%
2693                 6      0.3%
Other values (1324)  1934   92.2%

Value  Count  Frequency (%)
1009   5      0.2%
1051   1      < 0.1%
1052   1      < 0.1%
1081   5      0.2%
1091   1      < 0.1%
1102   1      < 0.1%
1118   5      0.2%
1129   1      < 0.1%
1200   1      < 0.1%
1223   1      < 0.1%

Value  Count  Frequency (%)
19999  1      < 0.1%
19973  1      < 0.1%
19943  1      < 0.1%
19926  1      < 0.1%
19859  5      0.2%
19847  1      < 0.1%
19845  1      < 0.1%
19833  1      < 0.1%
19740  1      < 0.1%
19717  1      < 0.1%

NumCompaniesWorked
Real number (ℝ≥0)

ZEROS

Distinct count: 10
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7411820781696856
Minimum: 0
Maximum: 9
Zeros: 265
Zeros (%): 12.6%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 4
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.54220098
Coefficient of variation (CV): 0.9274104774
Kurtosis: -0.124867917
Mean: 2.741182078
Median Absolute Deviation (MAD): 1
Skewness: 0.9934920768
Sum: 5751
Variance: 6.462785822
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
1      777    37.0%
0      265    12.6%
3      199    9.5%
2      194    9.2%
4      191    9.1%
6      118    5.6%
7      114    5.4%
5      95     4.5%
9      84     4.0%
8      61     2.9%

Value  Count  Frequency (%)
0      265    12.6%
1      777    37.0%
2      194    9.2%
3      199    9.5%
4      191    9.1%
5      95     4.5%
6      118    5.6%
7      114    5.4%
8      61     2.9%
9      84     4.0%

Value  Count  Frequency (%)
9      84     4.0%
8      61     2.9%
7      114    5.4%
6      118    5.6%
5      95     4.5%
4      191    9.1%
3      199    9.5%
2      194    9.2%
1      777    37.0%
0      265    12.6%
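
Because NumCompaniesWorked takes only ten distinct values, its frequency table is enough to rebuild the summary statistics. The sketch below expands the table into one value per observation and recomputes the sum, mean, median, and variance. The reported variance is matched when the sample (n - 1) form is used; that is an observation from the arithmetic rather than something the report states.

```python
import statistics

# (value, count) pairs copied from the NumCompaniesWorked frequency table.
frequency_table = [
    (0, 265), (1, 777), (2, 194), (3, 199), (4, 191),
    (5, 95), (6, 118), (7, 114), (8, 61), (9, 84),
]

# Expand the table back into one value per observation.
observations = [value for value, count in frequency_table for _ in range(count)]

n = len(observations)
total = sum(observations)
mean = total / n
median = statistics.median(observations)
sample_variance = statistics.variance(observations)  # divides by n - 1

print(n, total, round(mean, 9), median, round(sample_variance, 9))
```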
 

OverTime
Boolean

Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
No     1338   63.8%
Yes    760    36.2%

PercentSalaryHike
Real number (ℝ≥0)

Distinct count: 15
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.194470924690181
Minimum: 11
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 11
5-th percentile: 11
Q1: 12
Median: 14
Q3: 18
95-th percentile: 22
Maximum: 25
Range: 14
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.688558387
Coefficient of variation (CV): 0.2427566189
Kurtosis: -0.3107657487
Mean: 15.19447092
Median Absolute Deviation (MAD): 2
Skewness: 0.822371698
Sum: 31878
Variance: 13.60546298
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
11     318    15.2%
12     286    13.6%
13     285    13.6%
14     265    12.6%
15     153    7.3%
17     126    6.0%
16     122    5.8%
18     121    5.8%
19     96     4.6%
22     88     4.2%
20     71     3.4%
21     68     3.2%
23     40     1.9%
24     37     1.8%
25     22     1.0%

Value  Count  Frequency (%)
11     318    15.2%
12     286    13.6%
13     285    13.6%
14     265    12.6%
15     153    7.3%
16     122    5.8%
17     126    6.0%
18     121    5.8%
19     96     4.6%
20     71     3.4%

Value  Count  Frequency (%)
25     22     1.0%
24     37     1.8%
23     40     1.9%
22     88     4.2%
21     68     3.2%
20     71     3.4%
19     96     4.6%
18     121    5.8%
17     126    6.0%
16     122    5.8%
Distinct count: 2
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
3      1772   84.5%
4      326    15.5%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 2
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
3177284.5%
 
432615.5%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2098100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
3177284.5%
 
432615.5%
 

Most occurring scripts

ValueCountFrequency (%) 
Common2098100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
3177284.5%
 
432615.5%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2098100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
3177284.5%
 
432615.5%
 

StockOptionLevel
Categorical

Distinct count: 4
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 16.5 KiB

Value  Count  Frequency (%)
0      1039   49.5%
1      740    35.3%
2      194    9.2%
3      125    6.0%

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1

Most occurring characters

ValueCountFrequency (%) 
0103949.5%
 
174035.3%
 
21949.2%
 
31256.0%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number2098100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0103949.5%
 
174035.3%
 
21949.2%
 
31256.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common2098100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
0103949.5%
 
174035.3%
 
21949.2%
 
31256.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2098100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0103949.5%
 
174035.3%
 
21949.2%
 
31256.0%
 

TotalWorkingYears
Real number (ℝ≥0)

Distinct count: 40
Unique (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.27121067683508
Minimum: 0
Maximum: 40
Zeros: 19
Zeros (%): 0.9%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
Median: 9
Q3: 14
95-th percentile: 26
Maximum: 40
Range: 40
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.581151524
Coefficient of variation (CV): 0.7380971691
Kurtosis: 1.150350782
Mean: 10.27121068
Median Absolute Deviation (MAD): 4
Skewness: 1.166355618
Sum: 21549
Variance: 57.47385843
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
10                   254    12.1%
1                    185    8.8%
6                    173    8.2%
8                    155    7.4%
7                    133    6.3%
5                    128    6.1%
9                    120    5.7%
4                    99     4.7%
3                    74     3.5%
2                    67     3.2%
12                   64     3.1%
15                   56     2.7%
11                   48     2.3%
17                   45     2.1%
14                   43     2.0%
18                   43     2.0%
16                   41     2.0%
13                   40     1.9%
20                   38     1.8%
21                   34     1.6%
19                   34     1.6%
22                   29     1.4%
24                   26     1.2%
23                   22     1.0%
0                    19     0.9%
Other values (15)    128    6.1%

Value  Count  Frequency (%)
0      19     0.9%
1      185    8.8%
2      67     3.2%
3      74     3.5%
4      99     4.7%
5      128    6.1%
6      173    8.2%
7      133    6.3%
8      155    7.4%
9      120    5.7%

Value  Count  Frequency (%)
40     2      0.1%
38     1      < 0.1%
37     4      0.2%
36     6      0.3%
35     3      0.1%
34     9      0.4%
33     7      0.3%
32     9      0.4%
31     9      0.4%
30     7      0.3%

TrainingTimesLastYear
Real number (ℝ≥0)

ZEROS

Distinct count: 7
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7430886558627265
Minimum: 0
Maximum: 6
Zeros: 94
Zeros (%): 4.5%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 3
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.282010221
Coefficient of variation (CV): 0.467360112
Kurtosis: 0.5149752082
Mean: 2.743088656
Median Absolute Deviation (MAD): 1
Skewness: 0.476846861
Sum: 5755
Variance: 1.643550208
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
2      795    37.9%
3      679    32.4%
4      187    8.9%
5      163    7.8%
1      103    4.9%
0      94     4.5%
6      77     3.7%

Value  Count  Frequency (%)
0      94     4.5%
1      103    4.9%
2      795    37.9%
3      679    32.4%
4      187    8.9%
5      163    7.8%
6      77     3.7%

Value  Count  Frequency (%)
6      77     3.7%
5      163    7.8%
4      187    8.9%
3      679    32.4%
2      795    37.9%
1      103    4.9%
0      94     4.5%

YearsAtCompany
Real number (ℝ≥0)

ZEROS

Distinct count: 37
Unique (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.311725452812202
Minimum: 0
Maximum: 40
Zeros: 84
Zeros (%): 4.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 5
Q3: 9
95-th percentile: 20
Maximum: 40
Range: 40
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.929169085
Coefficient of variation (CV): 0.9393895742
Kurtosis: 4.544391039
Mean: 6.311725453
Median Absolute Deviation (MAD): 3
Skewness: 1.878901104
Sum: 13242
Variance: 35.15504604
Histogram with fixed size bins (bins=10)
Value                Count  Frequency (%)
1                    335    16.0%
5                    244    11.6%
2                    215    10.2%
3                    180    8.6%
10                   160    7.6%
4                    154    7.3%
7                    118    5.6%
8                    108    5.1%
9                    106    5.1%
6                    104    5.0%
0                    84     4.0%
13                   32     1.5%
11                   32     1.5%
20                   27     1.3%
14                   26     1.2%
15                   24     1.1%
21                   18     0.9%
22                   15     0.7%
19                   15     0.7%
12                   14     0.7%
17                   13     0.6%
18                   13     0.6%
16                   12     0.6%
24                   10     0.5%
33                   9      0.4%
Other values (12)    30     1.4%

Value  Count  Frequency (%)
0      84     4.0%
1      335    16.0%
2      215    10.2%
3      180    8.6%
4      154    7.3%
5      244    11.6%
6      104    5.0%
7      118    5.6%
8      108    5.1%
9      106    5.1%

Value  Count  Frequency (%)
40     1      < 0.1%
37     1      < 0.1%
36     2      0.1%
34     1      < 0.1%
33     9      0.4%
32     3      0.1%
31     3      0.1%
30     1      < 0.1%
29     2      0.1%
27     2      0.1%

YearsInCurrentRole
Real number (ℝ≥0)

ZEROS

Distinct count: 19
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7545281220209725
Minimum: 0
Maximum: 18
Zeros: 436
Zeros (%): 20.8%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 7
95-th percentile: 10
Maximum: 18
Range: 18
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.485687199
Coefficient of variation (CV): 0.9283955496
Kurtosis: 0.7071793168
Mean: 3.754528122
Median Absolute Deviation (MAD): 2
Skewness: 1.034073023
Sum: 7877
Variance: 12.15001525
Histogram with fixed size bins (bins=10)
Value  Count  Frequency (%)
2      560    26.7%
0      436    20.8%
7      294    14.0%
3      183    8.7%
4      140    6.7%
8      109    5.2%
1      97     4.6%
9      79     3.8%
6      41     2.0%
5      40     1.9%
10     33     1.6%
11     22     1.0%
13     18     0.9%
12     14     0.7%
14     11     0.5%
15     8      0.4%
16     7      0.3%
17     4      0.2%
18     2      0.1%

Value  Count  Frequency (%)
0      436    20.8%
1      97     4.6%
2      560    26.7%
3      183    8.7%
4      140    6.7%
5      40     1.9%
6      41     2.0%
7      294    14.0%
8      109    5.2%
9      79     3.8%

Value  Count  Frequency (%)
18     2      0.1%
17     4      0.2%
16     7      0.3%
15     8      0.4%
14     11     0.5%
13     18     0.9%
12     14     0.7%
11     22     1.0%
10     33     1.6%
9      79     3.8%

YearsSinceLastPromotion
Real number (ℝ≥0)

ZEROS

Distinct count: 16
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0972354623450906
Minimum: 0
Maximum: 15
Zeros: 881
Zeros (%): 42.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 9
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.169219208
Coefficient of variation (CV): 1.511141341
Kurtosis: 3.792262367
Mean: 2.097235462
Median Absolute Deviation (MAD): 1
Skewness: 2.019676097
Sum: 4400
Variance: 10.04395039
Histogram with fixed size bins (bins=10)
Value   Count   Frequency (%)
0       881     42.0%
1       477     22.7%
2       243     11.6%
7       124     5.9%
4       73      3.5%
3       64      3.1%
5       49      2.3%
6       48      2.3%
9       29      1.4%
11      28      1.3%
8       18      0.9%
15      17      0.8%
13      14      0.7%
14      13      0.6%
12      10      0.5%
10      10      0.5%

Value   Count   Frequency (%)
0       881     42.0%
1       477     22.7%
2       243     11.6%
3       64      3.1%
4       73      3.5%
5       49      2.3%
6       48      2.3%
7       124     5.9%
8       18      0.9%
9       29      1.4%

Value   Count   Frequency (%)
15      17      0.8%
14      13      0.6%
13      14      0.7%
12      10      0.5%
11      28      1.3%
10      10      0.5%
9       29      1.4%
8       18      0.9%
7       124     5.9%
6       48      2.3%

YearsWithCurrManager
Real number (ℝ≥0)

ZEROS

Distinct count: 18
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7106768350810295
Minimum: 0
Maximum: 17
Zeros: 495
Zeros (%): 23.6%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 7
95-th percentile: 10
Maximum: 17
Range: 17
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.507698519
Coefficient of variation (CV): 0.9452988429
Kurtosis: 0.3110244058
Mean: 3.710676835
Median Absolute Deviation (MAD): 2
Skewness: 0.9297415982
Sum: 7785
Variance: 12.3039489
Histogram with fixed size bins (bins=10)
Value   Count   Frequency (%)
0       495     23.6%
2       480     22.9%
7       292     13.9%
3       194     9.2%
8       135     6.4%
4       126     6.0%
1       108     5.1%
9       72      3.4%
5       39      1.9%
6       37      1.8%
10      35      1.7%
11      26      1.2%
12      18      0.9%
13      14      0.7%
14      13      0.6%
17      7       0.3%
15      5       0.2%
16      2       0.1%

Value   Count   Frequency (%)
0       495     23.6%
1       108     5.1%
2       480     22.9%
3       194     9.2%
4       126     6.0%
5       39      1.9%
6       37      1.8%
7       292     13.9%
8       135     6.4%
9       72      3.4%

Value   Count   Frequency (%)
17      7       0.3%
16      2       0.1%
15      5       0.2%
14      13      0.6%
13      14      0.7%
12      18      0.9%
11      26      1.2%
10      35      1.7%
9       72      3.4%
8       135     6.4%

CommunicationSkill
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1167778836987607
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 16.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.409889016
Coefficient of variation (CV): 0.4523546652
Kurtosis: -1.288031349
Mean: 3.116777884
Median Absolute Deviation (MAD): 1
Skewness: -0.1023047057
Sum: 6539
Variance: 1.987787038
Histogram with fixed size bins (bins=10)
Value   Count   Frequency (%)
5       470     22.4%
4       444     21.2%
3       412     19.6%
2       405     19.3%
1       367     17.5%

Value   Count   Frequency (%)
1       367     17.5%
2       405     19.3%
3       412     19.6%
4       444     21.2%
5       470     22.4%

Value   Count   Frequency (%)
5       470     22.4%
4       444     21.2%
3       412     19.6%
2       405     19.3%
1       367     17.5%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for an exact linear relationship the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
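As an illustration of the calculation just described, here is a minimal pure-Python sketch. The function name `pearson_r` and the toy values are assumptions made for this example; they are not taken from the report or from pandas-profiling itself.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so plain sums of products and squares are enough.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / (spread_x * spread_y)


# Hypothetical toy data: shifting and rescaling x leaves r unchanged.
xs = [1, 2, 4, 7]
ys = [3, 1, 5, 6]
print(pearson_r(xs, ys))
print(pearson_r([10 * x + 5 for x in xs], ys))
```

The second call shows the location and scale invariance mentioned above: both lines should print the same value up to floating-point rounding.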

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
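The rank-based calculation can be sketched as follows. This is a minimal pure-Python illustration, assuming tied values receive the average of the ranks they span; the helper names `average_ranks` and `spearman_rho` and the toy values are hypothetical and not part of the report.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (spread_x * spread_y)


# Hypothetical toy data: y = x**3 is nonlinear but strictly increasing.
print(spearman_rho([1, 2, 3, 4], [1, 8, 27, 64]))
```

Because the cubic relationship is strictly increasing, the ranks of the two variables coincide and ρ reaches its maximum even though the relationship is not linear.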

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
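The pair-counting definition above can be sketched directly. This minimal example implements the simplest variant (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations frequently use a tie-adjusted variant instead, so results may differ on data with ties. The name `kendall_tau_a` and the toy values are assumptions for illustration.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it is counted in neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: 6 pairs in total, 5 concordant and 1 discordant.
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```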

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
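For orientation, a bias-corrected Cramér's V in the style of Bergsma's correction can be sketched from a contingency table of counts. This is a minimal illustration under stated assumptions, not the report generator's actual code: the function name `cramers_v_corrected`, the list-of-rows table layout, and the toy tables are all hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)          # number of row categories
    k = len(table[0])       # number of column categories
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n < 2:
        raise ValueError("need at least 2 observations")

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floored at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    # Shrink the effective numbers of rows and columns accordingly.
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0          # degenerate table, e.g. a single row or column
    return math.sqrt(phi2_corr / denom)


# Hypothetical toy tables: perfect association, then exact independence.
print(cramers_v_corrected([[10, 0], [0, 10]]))
print(cramers_v_corrected([[5, 5], [5, 5]]))
```

The first table, where each row category maps to exactly one column category, should give a value of 1 (up to rounding), and the second, where counts are spread evenly, gives 0.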

Missing values

Sample

First rows

df_indexAgeBusinessTravelDepartmentDistanceFromHomeEducationEducationFieldEnvironmentSatisfactionGenderJobInvolvementJobRoleJobSatisfactionMaritalStatusMonthlyIncomeNumCompaniesWorkedOverTimePercentSalaryHikePerformanceRatingStockOptionLevelTotalWorkingYearsTrainingTimesLastYearYearsAtCompanyYearsInCurrentRoleYearsSinceLastPromotionYearsWithCurrManagerCommunicationSkill
0030Non-TravelResearch & Development23Medical3Female3Laboratory Technician4Single25640No1430122117674
1136Travel_RarelyResearch & Development124Life Sciences3Female3Manufacturing Director3Married46639Yes12327232112
2255Travel_RarelySales21Medical3Male3Sales Executive4Single51604No163012397735
3339Travel_RarelyResearch & Development241Life Sciences1Male3Research Scientist4Single41087No133018277174
4437Travel_RarelyResearch & Development33Other3Male3Manufacturing Director3Married94341No1531102107781
5531Travel_RarelySales74Life Sciences2Male2Sales Representative3Married23293No153013277522
6632Travel_RarelyResearch & Development13Life Sciences4Male2Laboratory Technician3Single37300Yes14304232121
7733Travel_RarelyResearch & Development44Medical1Female2Laboratory Technician2Married38388No11308554025
8835Travel_FrequentlySales112Marketing4Male3Sales Executive4Divorced49681No11315352024
9921Travel_RarelySales71Marketing2Male3Sales Representative2Single26791No13301310105

Last rows

df_indexAgeBusinessTravelDepartmentDistanceFromHomeEducationEducationFieldEnvironmentSatisfactionGenderJobInvolvementJobRoleJobSatisfactionMaritalStatusMonthlyIncomeNumCompaniesWorkedOverTimePercentSalaryHikePerformanceRatingStockOptionLevelTotalWorkingYearsTrainingTimesLastYearYearsAtCompanyYearsInCurrentRoleYearsSinceLastPromotionYearsWithCurrManagerCommunicationSkill
208846038Travel_RarelyResearch & Development152Life Sciences3Male2Research Director4Divorced115100Yes14311231110291
208946143Travel_RarelyResearch & Development142Life Sciences2Male3Research Director1Married171596No244122341102
209046231Travel_RarelySales203Life Sciences2Female1Sales Executive3Married45593Yes11314222221
209146338Travel_RarelyResearch & Development183Medical2Male1Healthcare Representative4Married58113Yes163115210102
209246422Travel_FrequentlyResearch & Development34Life Sciences3Male2Research Scientist4Married28530Yes11311500005
209346532Travel_RarelyResearch & Development24Life Sciences4Male3Laboratory Technician2Single13931No12301210005
209446618Travel_FrequentlySales32Medical2Female3Sales Representative4Single15691Yes12300200002
209546724Travel_RarelyResearch & Development233Medical2Male4Research Scientist4Married27251Yes11326365141
209646831Travel_RarelyResearch & Development233Medical2Male2Healthcare Representative4Married55820No214110290783
209746936Travel_RarelyResearch & Development54Life Sciences2Female3Healthcare Representative1Married80084No12329632025